Journal of Pathology Informatics — Latest Matching Preprints

1

A human-in-the-loop explanation framework for morphologically transparent AI predictions from whole-slide images

Lou, P.; Zhu, Y.; Chia, N.; Kumari, R.; Yang, W.; Wang, Y.; Brenna, N.; Winham, S.; Guo, R.; Goode, E.; Huang, Y.; Han, W.; Feng, T.; Wang, C.

2026-01-29 pathology 10.64898/2026.01.27.701796 medRxiv

Top 0.1%

48.5%

Show abstract

Deep learning models enable the prediction of clinical endpoints from whole-slide images (WSIs), but many such models function as "black boxes", lacking transparency about whether and which histomorphological patterns drive their predictions, hindering interpretability and clinical adoption. Here we propose a human-in-the-loop explanation framework, MorphoXAI, which provides both local and global interpretability for deep learning models by incorporating human-expert interpretations. At the global level, it reveals the histomorphological patterns on which the model consistently relies to distinguish between classes of WSIs, as well as the patterns associated with confusion between classes. At the local level, it indicates which of these patterns are used in the prediction of an individual WSI and which regions within the slide correspond to such patterns. We validated our method on a deep learning model trained for ovarian tumor histologic subtype prediction. The results show that our framework generates explanations that accurately reflect the histomorphology underlying the models predictions at both global and local levels. For interpretability and clinical utility in diagnostic contexts, human evaluation results showed that our explanations were easy to interpret, rich in diagnostic features, and directly helpful for diagnostic decision-making, thereby enhancing pathologist-AI collaboration. Our work highlights that unifying global and local explanations and grounding them in expert-interpreted morphology enhances the interpretability and verifiability of deep learning models, thereby facilitating the transparent deployment of such models in clinical practice.

2

Tissue Region Segmentation In H&E-Stained Andihc-Stained Pathology Slides Of Specimens Fromdifferent Origins

Naghshineh Kani, S.; Soyak, B. C.; Gokce, M.; Duyar, Z.; Alicikus, H.; Yapicier, O.; Oner, M. U.

2025-01-17 pathology 10.1101/2025.01.16.25320663 medRxiv

Top 0.1%

45.0%

Show abstract

AO_SCPLOWBSTRACTC_SCPLOWWith the rise of digital pathology, integrating digital slides with deep learning-based decision support systems is becoming increasingly common in clinical practice. Tissue region segmentation which is distinguishing tissue from background/artefacts, is an important pre-requisite in many digital pathology pipelines both for the laboratories as their first step in digitalizing the glass slides of tissue samples and turning them to whole slide images (WSIs) using scanners, and also for DL researches such as region-of-interest cropping, tumor detection, cell segmentation. However, it is well known that WSI scanners can fail in detecting all tissue regions, due to the tissue type, or due to weak staining and this is because of their not robust enough tissue detection algorithms which makes segmentation of WSIs a challenging task. Hence, this study develops a fast, lightweight, accurate, CPU-ready DL approach, enabling fast and reliable tissue region segmentation model by training and testing it across seven different institutional H&E and IHC stained WSIs to result a strong in generalization with the 22 to 56 s/WSI inference time using CPU that markedly outperforms classical OTSU thresholding, particularly in preserving challenging or faint tissue regions by achieving notably higher and more consistent performance than OTSU, with median Jaccard and Dice scores of approximately 0.86 and 0.92, respectively, compared to OTSU whcih was between 0.56 and 0.72. Our approach provides a practical, open-source solution for resource-limited pathology settings. We publicly released dataset obtained from Bahcesehir Medical School, and code to foster benchmarking and further advances in efficient, deployable computational pathology. The model could be used in digital slide scanners to improve the scanning process and in the pre-processing stages of DL pipelines to prepare high-quality datasets.

3

A novel approach to classification and segmentation of colon cancer imaging towards personalized medicine

Harikrishnan, K.; Tarcar, A. K.; Botelho, N.; Kenkre, A.; Rebelo, P.

2023-07-08 pathology 10.1101/2023.07.07.23292356 medRxiv

Top 0.1%

44.8%

Show abstract

Recent advances in the field of pathology coupled with the rapid evolution of machine learning based techniques have revolutionized healthcare practices. Colorectal cancer accounts for one of the top 5 cancers with high incidence (126,240 in 2020) with a high mortality worldwide [1] [2]. Tissue biopsy remains to be the gold standard procedure for accurate diagnosis, treatment planning and prognosis prediction [3]. As an image based modality, pathology has attracted a lot of attention for development of AI algorithms and there has been a steady increase in the number of filings for FDA authorized use of AI algorithms in clinical practice [4]. The SemiCOL Challenge aims to develop computational pathology methods for automatic segmentation and classification of tumor and other tissue classes using H&E stained images. In this paper, we present a novel machine learning framework addressing the SemiCOL Challenge, focusing on semantic segmentation, segmentation-based whole-slide image classification, and effective use of limited annotated data. Our approach leverages deep learning techniques and incorporates data augmentation to improve the accuracy and efficiency of tumor tissue detection and classification in CRC. The proposed method achieves an average Dice score of 0.2785 for segmentation and an AUC score of 0.71 for classification across 20 whole-slide images. This framework has the potential to revolutionize the field of computational pathology, contributing to more efficient and accurate diagnostic tools for colorectal cancer.

4

Breast Tumor Cellularity Assessment using Deep Neural Networks

Rakhlin, A.; Tiulpin, A.; Shvets, A. A.; Kalinin, A. A.; Iglovikov, V. I.; Nikolenko, S.

2019-08-30 pathology 10.1101/628693 medRxiv

Top 0.1%

42.6%

Show abstract

Breast cancer is one of the main causes of death world-wide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumors response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patients survival. In current clinical practice, tumor cellularity is manually estimated by pathologists; this process is tedious and prone to errors or low agreement rates between assessors. In this work, we evaluated three strong novel Deep Learning-based approaches for automatic assessment of tumor cellularity from post-treated breast surgical specimens stained with hematoxylin and eosin. We validated the proposed methods on the BreastPathQ SPIE challenge dataset that consisted of 2395 image patches selected from whole slide images acquired from 64 patients. Compared to expert pathologist scoring, our best performing method yielded the Cohens kappa coefficient of 0.69 (vs. 0.42 previously known in literature) and the intra-class correlation coefficient of 0.89 (vs. 0.83). Our results suggest that Deep Learning-based methods have a significant potential to alleviate the burden on pathologists, enhance the diagnostic workflow, and, thereby, facilitate better clinical outcomes in breast cancer treatment.

5

Weakly supervised learning for poorly differentiated adenocarcinoma classification in gastric endoscopic submucosal dissection whole slide images

Tsuneki, M.; Ichihara, S.; Kanavati, F.

2022-05-29 pathology 10.1101/2022.05.28.22275729 medRxiv

Top 0.1%

42.1%

Show abstract

The endoscopic submucosal dissection (ESD) is the preferred technique for treating early gastric cancers including poorly differentiated adenocarcinoma without ulcerative findings. The histopathological classification of poorly differentiated adenocarcinoma including signet ring cell carcinoma is of pivotal importance for determining further optimum cancer treatment(s) and clinical outcomes. Because conventional diagnosis by pathologists using microscopes is time-consuming and limited in terms of human resources, it is very important to develop computer-aided techniques that can rapidly and accurately inspect large numbers of histopathological specimen whole-slide images (WSIs). Computational pathology applications which can assist pathologists in detecting and classifying gastric poorly differentiated adenocarcinoma from ESD WSIs would be of great benefit for routine histopathological diagnostic workflow. In this study, we trained the deep learning model to classify poorly differentiated adenocarcinoma in ESD WSIs by transfer and weakly supervised learning approaches. We evaluated the model on ESD, endoscopic biopsy, and surgical specimen WSI test sets, achieving and ROC-AUC up to 0.975 in gastric ESD test sets for poorly differentiated adenocarcinoma. The deep learning model developed in this study demonstrates the high promising potential of deployment in a routine practical gastric ESD histopathological diagnostic workflow as a computer-aided diagnosis system.

6

Graph Convolutional Neural Networks for Histological Classification of Pancreatic Cancer

Wu, W.; Liu, X.; Hamilton, R.; Suriawinata, A.; Hassanpour, S.

2022-01-28 pathology 10.1101/2022.01.26.22269832 medRxiv

Top 0.1%

41.1%

Show abstract

Pancreatic ductal adenocarcinoma has some of the worst prognostic outcomes among various cancer types. Detection of histologic patterns of pancreatic tumors is essential to predict prognosis and decide about the treatment for patients. This histologic classification can have a large degree of variability even among expert pathologists. This study proposes a graph convolutional network-based deep learning model to detect aggressive adenocarcinoma and less aggressive pancreatic tumors from benign cases. Our model uses a convolutional neural network to extract detailed information from every small region in a whole-slide image. Then, we use a graph architecture to aggregate the extracted features from these regions and their positional information to capture the whole-slide level structure and make the final prediction. We evaluated our model on an independent test set and achieved an F1 score of 0.85 for detecting neoplastic cells and ductal adenocarcinoma, significantly outperforming other baseline methods. If validated in prospective studies, this approach has a great potential to assist pathologists in identifying adenocarcinoma and other types of pancreatic tumors in clinical settings.

7

Enhancing Whole Slide Image Classification with Discriminative and Contrastive Learning

Liang, P.; Zheng, H.; Li, H.; Gong, Y.; Fan, Y.

2024-05-10 pathology 10.1101/2024.05.07.593019 medRxiv

Top 0.1%

40.7%

Show abstract

Whole slide image (WSI) classification plays a crucial role in digital pathology data analysis. However, the immense size of WSIs and the absence of fine-grained sub-region labels, such as patches, pose significant challenges for accurate WSI classification. Typical classification-driven deep learning methods often struggle to generate compact image representations, which can compromise the robustness of WSI classification. In this study, we address this challenge by incorporating both discriminative and contrastive learning techniques for WSI classification. Different from the extant contrastive learning methods for WSI classification that primarily assign pseudo labels to patches based on the WSI-level labels, our approach takes a different route to directly focus on constructing positive and negative samples at the WSI-level. Specifically, we select a subset of representative and informative patches to represent WSIs and create positive and negative samples at the WSI-level, allowing us to better capture WSI-level information and increase the likelihood of effectively learning informative features. Experimental results on two datasets and ablation studies have demonstrated that our method significantly improved the WSI classification performance compared to state-of-the-art deep learning methods and enabled learning of informative features that promoted robustness of the WSI classification.

8

A Deep Learning Approach for Histology-Based Nuclei Segmentation and Tumor Microenvironment Characterization

Rong, R.; Sheng, H.; Jin, K. W.; Wu, F.; Luo, D.; Wen, Z.; Tang, C.; Yang, D. M.; Jia, L.; Amgad, M.; Cooper, L. A. D.; Xie, Y.; Zhan, X.; Wang, S.; Xiao, G.

2022-12-12 pathology 10.1101/2022.12.08.519641 medRxiv

Top 0.1%

40.6%

Show abstract

Microscopic examination of pathology slides is essential to disease diagnosis and biomedical research; however, traditional manual examination of tissue slides is laborious and subjective. Tumor whole-slide image (WSI) scanning is becoming part of routine clinical procedure and produces massive data that capture tumor histological details at high resolution. Furthermore, the rapid development of deep learning algorithms has significantly increased the efficiency and accuracy of pathology image analysis. In light of this progress, digital pathology is fast becoming a powerful tool to assist pathologists. Studying tumor tissue and its surrounding microenvironment provides critical insight into tumor initiation, progression, metastasis, and potential therapeutic targets. Nuclei segmentation and classification are critical to pathology image analysis, especially in characterizing and quantifying the tumor microenvironment (TME). Computational algorithms have been developed for nuclei segmentation and TME quantification within image patches; however, existing algorithms are computationally intensive and time-consuming for WSI analysis. In this study, we present Histology-based Detection using Yolo (HD-Yolo), a new method that significantly accelerates nuclei segmentation and TME quantification. We demonstrate that HD-Yolo outperforms existing methods for WSI analysis in nuclei detection and classification accuracy, as well as computation time.

9

LiteMIL: A Computationally Efficient Transformer-Based MIL for Cancer Subtyping on Whole Slide Images.

Kussaibi, H.

2025-05-12 pathology 10.1101/2025.05.11.25327389 medRxiv

Top 0.1%

40.5%

Show abstract

PurposeAccurate cancer subtyping is crucial for effective treatment; however, it presents challenges due to overlapping morphology and variability among pathologists. Although deep learning (DL) methods have shown potential, their application to gigapixel whole slide images (WSIs) is often hindered by high computational demands and the need for efficient, context-aware feature aggregation. This study introduces LiteMIL, a computationally efficient transformer-based multiple instance learning (MIL) network combined with Phikon, a pathology-tuned self-supervised feature extractor, for robust and scalable cancer subtyping on WSIs. MethodsInitially, patches were extracted from TCGA-THYM dataset (242 WSIs, six subtypes) and subsequently fed in real-time to Phikon for feature extraction. To train MILs, features were arranged into uniform bags using a chunking strategy that maintains tissue context while increasing training data. LiteMIL utilizes a learnable query vector within an optimized multi-head attention module for effective feature aggregation. The models performance was evaluated against established MIL methods on the Thymic Dataset and three additional TCGA datasets (breast, lung, and kidney cancer). ResultsLiteMIL achieved 0.89 {+/-} 0.01 F1 score and 0.99 AUC on Thymic dataset, outperforming other MILs. LiteMIL demonstrated strong generalizability across the external datasets, scoring the best on breast and kidney cancer datasets. Compared to TransMIL, LiteMIL significantly reduces training time and GPU memory usage. Ablation studies confirmed the critical role of the learnable query and layer normalization in enhancing performance and stability. ConclusionLiteMIL offers a resource-efficient, robust solution. Its streamlined architecture, combined with the compact Phikon features, makes it suitable for integrating into routine histopathological workflows, particularly in resource-limited settings.

10

XKidneyOnco: An Explainable Framework to Classify Renal Oncocytoma and Chromophobe Renal Cell Carcinoma with a Small Sample Size

Javaheri, T.; Yang, X.; Yerra, S.; Seidi, K.; Setayesh, T.; Zhang, G.; Chitkushev, L.; Salar, S. S.; Sayeeduddin, Z.; Zarrin-Khameh, N.; Gharib, M. H.; Castro, P.; Haeri, M.; Rawassizadeh, R.

2024-01-25 pathology 10.1101/2024.01.23.576782 medRxiv

Top 0.1%

40.0%

Show abstract

Renal oncocytoma and chromophobe renal cell carcinoma are two kidney cancer types that present a diagnostic challenge to pathologists and other clinicians due to their microscopic similarities. While RO is a benign renal neoplasm, ChRCC is considered malignant. Therefore, the differentiation between the two is crucial. In this study, we introduce an explainable framework to accurately differentiate ChRCC from RO, histologically. Our approach examined H&E-stained images of 656 ChRCC and 720 RO, and achieved a diagnostic accuracy of 88.2%, the sensitivity of 87%, and 100% specificity for explainable AI, which either outperforms or operate on par with convolutional neural network (CNN) models. Besides, we enrolled 44 pathology experts (including pathologists and pathology trainees) to differentiate the two tumors. The average accuracy of pathologists was 73%, which is 15.2% lower than our framework. These results indicate that the combination of human expert along with explainable AI achieve higher accuracy in differentiating the two tumors, while it reduces the workload of experts and offers the desired explainability for the medical experts.

11

A deep learning-based iterative digital pathology annotation tool

Jaber, M. I.; Song, B.; Beziaeva, L.; Szeto, C. W.; Spilman, P.; Yang, P.; Soon-Shiong, P.

2021-08-24 pathology 10.1101/2021.08.23.457396 medRxiv

Top 0.1%

39.7%

Show abstract

Well-annotated exemplars are an important prerequisite for supervised deep learning schemes. Unfortunately, generating these annotations is a cumbersome and laborious process, due to the large amount of time and effort needed. Here we present a deep-learning-based iterative digital pathology annotation tool that is both easy to use by pathologists and easy to integrate into machine vision systems. Our pathology image annotation tool greatly reduces annotation time from hours to a few minutes, while maintaining high fidelity with human-expert manual annotations. Here we demonstrate that our active learning tool can be used for a variety of pathology annotation tasks including masking tumor, stroma, and lymphocyte-rich regions, among others. This annotation automation system was validated on 90 unseen digital pathology images with tumor content from the CAMELYON16 database and it was found that pathologists gold standard masks were re-produced successfully using our tool. That is, an average of 2.7 positive selections (mouse clicks) and 8.0 negative selections (mouse clicks) were sufficient to generate tumor masks similar to pathologists gold standard in CAMELYON16 test WSIs. Furthermore, the developed image annotation tool has been used to build gold standard masks for hundreds of TCGA digital pathology images. This set was used to train a convolutional neural network for identification of tumor epithelium. The developed pan-cancer deep neural network was then tested on TCGA and internal data with comparable performance. The validated pathology image annotation tool described herein has the potential to be of great value in facilitating accurate, rapid pathological analysis of tumor biopsies.

12

Adaptive Example Selection: Prototype-Based Explainability for Interpretable Mitosis Detection

Banik, M.; Kreutz-Delgado, K.; Mohanty, I.; Brown, J. B.; Singh, N.

2025-03-11 pathology 10.1101/2025.03.05.641711 medRxiv

Top 0.1%

39.1%

Show abstract

Understanding the decision-making process of black-box neural network classifiers is crucial for their adoption in medical applications, including histopathology and cancer diagnostics. An approach of increasing interest is to clarify how the decisions of neural networks compare to, and perform parallel to, those of highly-trained and knowledgeable clinicians and other medical professionals within their prototypical classes of interest. Motivated by this, we introduce Adaptive Example Selection (AES), a prototype-based explainable AI (XAI) framework that facilitates the interpretability of deep learning models for mitosis detection. AES works by selecting and presenting a small set of real-world mitotic images most informative to a given classification, allowing pathologists to visually assess and understand the neural networks decision by comparing test cases with similar previously annotated examples. AES achieves this by expanding the neural networks confidence/belief function and fitting it to a radial basis function (RBF) approximator, an approach we term Decision Boundary-based Analysis (DBA). This method makes the decision boundary more transparent, offering robust visual insights into the models decisions, and thereby equipping clinicians with the information needed to effectively utilize AI-driven diagnostics. Additionally, AES includes customizable user controls, allowing clinicians to tailor decision thresholds and select prototype examples to better align with their specific diagnostic needs. This flexibility empowers users to engage with the AI model more directly and meaningfully, increasing its practical relevance in clinical settings.

13

Visualizing Decisions and Analytics of Artificial Intelligence based Cancer Diagnosis and Grading of Specimen Digitized Biopsy: Case Study for Prostate Cancer

Amal, S.; Singh, A.; Harrison, L.; Breggia, A.; Christman, R.; Winslow, R. L.; Wan, M.

2022-12-22 pathology 10.1101/2022.12.21.22283754 medRxiv

Top 0.1%

38.7%

Show abstract

1The rise in Artificial Intelligence (AI) and deep learning research has shown great promise in diagnosing prostate cancer from whole slide image biopsies. Intelligent application interface for diagnosis is a progressive way to communicate AI results in the medical domain for practical use. This paper aims to suggest a way to integrate state-of-the-art deep learning algorithms into a web application for visualizations of decisions and analytics of an AI based algorithms applied on cancer digitized specimen biopsies together with visualizing evidence and explanation of the decision using both image from the biopsy and textual data from Electronic Health Records (EHR). By creating smart visualizations of tissue biopsy images, from magnified regions to augmented sharper images along with image masks that highlight cancerous regions of tissue in addition to intelligent analytics and distribution charts related to cancer prediction, we aim to communicate these easily interpretable results to assist pathologists and concerned medical team to make better decisions for prostate cancer diagnosis as case study.

14

Class-Controlled Copy-Paste Based Cell Segmentation for CoNIC Challenge

Ahn, H.; Hong, Y.

2022-03-04 pathology 10.1101/2022.03.02.482203 medRxiv

Top 0.1%

38.4%

Show abstract

Muti-class cell segmentation in histopathology images is a challenging task. Here, we propose a copy-paste augmentation-based method for CoNIC challenge. As the challenge train data is severely class imbalanced. To deal with it, we copy all cell objects of train data and paste them to the train image on the fly while training model. The paste strategy is that we paste more cell objects of the insufficient classes and paste less cell objects for the sufficient classes. We experimented the method by stratified splitting train data in 4:1 ratio, the result shows the copy paste method can reach PQ 64.84 and mPQ 53.72, which improved and 0.66 compared to without copy pasted. Moreover, the improvements in those insufficient classes is more obvious.

15

Model for Ki-67 Proliferation Index Predicton, Trained End-to-End on Routine Diagnostic Data

Kukucka, A.; Obdrzalek, J.; Musil, V.; Nenutil, R.; Holub, P.; Brazdil, T.

2025-06-26 pathology 10.1101/2025.06.26.25330333 medRxiv

Top 0.1%

38.2%

Show abstract

For breast cancer, the Ki-67 index gives important information on the patients prognosis and may predict the response to therapy. However, semi-automated methods for Ki-67 index calculation are prone to intra-and inter-observer variability, while fully automated machine learning models based on nuclei segmentation, classification and counting require training on large datasets with precise annotations down to the level of individual nuclei, which are hard to obtain. We design a neural network that straightforwardly predicts the Ki-67 index from scans of H&DAB-stained tissue samples. The network is trained only on existing data from daily operations at Masaryk Memorial Cancer Institute, Brno. The image labels contain only the Ki-67 index without any tumour epithelium or nuclei annotations. We use a simple convolutional neural network, not biasing the network by incorporation of layers dedicated to epithelium or nuclei segmentation or classification. Our models predictions align with the state-of-the-art evaluation by pathologists using QuPath image analysis with manual tumour annotation. On a test set consisting of 1250 images, the model achieved the mean absolute error of 3.668 and Pearsons correlation coefficient of 0.959 (p < 0.001). Surprisingly, despite using a simple architecture and very weak supervision, the model persuasively detects complex morphological structures such as tumour epithelium. The model also works on Whole Slide Image data, e.g. to detect the hotspot areas. Since our approach does not need any specifically labelled data or additional staining, it is cost-effective and allows easy domain adaptation.

16

Deep Learning in Automating Breast Cancer Diagnosis from Microscopy Images

Gu, Q.; Prodduturi, N.; Hart, S. N.

2023-06-16 pathology 10.1101/2023.06.15.23291437 medRxiv

Top 0.1%

37.5%

Show abstract

ContextBreast cancer is one of the most common cancers in women. With early diagnosis, some breast cancers are highly curable. However, the concordance rate of breast cancer diagnosis from histology slides by pathologists is unacceptably low. Classifying normal versus tumor breast tissues from microscopy images of breast histology is an ideal case to use for deep learning and could help to more reproducibly diagnose breast cancer. Since data preprocessing and hyperparameter configurations have impacts on breast cancer classification accuracies of deep learning models, training a deep learning classifier with appropriate data preprocessing approaches and optimized hyperparameter configurations could improve breast cancer classification accuracy. Methods and MaterialUsing 12 combinations of deep learning model architectures (i.e., including 5 non-specialized and 7 digital pathology-specialized model architectures), image data preprocessing, and hyperparameter configurations, the validation accuracy of tumor versus normal classification were calculated using the BreAst Cancer Histology (BACH) dataset. ResultsThe DenseNet201, a non-specialized model architecture, with transfer learning approach achieved 98.61% validation accuracy compared to only 64.00% for the digital pathology-specialized model architecture. ConclusionsThe combination of image data preprocessing approaches and hyperparameter configurations have a profound impact on the performance of deep neural networks for image classification. To identify a well-performing deep neural network to classify tumor versus normal breast histology, researchers should not only focus on developing new models specifically for digital pathology, since hyperparameter tuning for existing deep neural networks in the computer vision field could also achieve a high (often better) prediction accuracy.

17

Comparison of the classification of HER2 from whole-slide images between pathologists and a deep learning model

Tsuneki, M.; Abe, M.; Kanavati, F.

2023-03-29 pathology 10.1101/2023.03.29.23287897 medRxiv

Top 0.1%

35.4%

Show abstract

HER2 (human epidermal growth factor receptor 2) is a protein that is found on the surface of some cells, including breast cells. HER2 plays a role in cell growth, division, and repair, and when it is overexpressed, it can contribute to the development of certain types of cancer, particularly breast cancer. HER2 overexpression occurs in approximately 20% of cases, and it is associated with more aggressive tumor phenotypes and poorer prognosis. This makes its status an important factor in determining treatment options for breast cancer. While HER2 expression is typically diagnosed through a combination of immunohistochemistry (IHC) and/or fluorescence in situ hybridization (FISH) testing on breast cancer tissue samples, we sought to determine to what extent it is possible to diagnose from H&E-stained specimens. To this effect we trained a deep learning model to classify HER2-positive image patches using a dataset of 10 whole-slide images (5 HER2-positive, 5 HER2-negative). We evaluated the model on a different test set consisting of patches extracted from 10 WSIs (5 HER2-positive, 5 HER2-negative), and we compared the performance against two pathologists on 100 512x512 patches (50 HER2-positive, 50 HER2-negative). Overall, the model achieved an accuracy of 73% while the pathologists achieved 58% and 47%, respectively.

18

Cross-institutional HER2 assessment via a computer-aided system using federated learning and stain composition augmentation

Yang, C.-H.; Chen, Y.-A.; Chang, S.-Y.; Hsieh, Y.-H.; Hung, Y.-L.; Lin, Y.-W.; Lee, Y.-H.; Lin, C.-H.; Lin, Y.-C.; Lu, Y.-S.; Lin, Y.-Y.

2024-01-22 pathology 10.1101/2024.01.17.576160 medRxiv

Top 0.1%

35.3%

Show abstract

The rapid advancement of precision medicine and personalized healthcare has heightened the demand for accurate diagnostic tests. These tests are crucial for administering novel treatments like targeted therapy. To ensure the widespread availability of accurate diagnostics with consistent standards, the integration of computer-aided systems has become essential. Specifically, computer-aided systems that assess biomarker expression have thrusted through the widespread application of deep learning for medical imaging. However, the generalizability of deep learning models has usually diminished significantly when being confronted with data collected from different sources, especially for histological imaging in digital pathology. It has therefore been challenging to effectively develop and employ a computer-aided system across multiple medical institutions. In this study, a biomarker computer-aided framework was proposed to overcome such challenges. This framework incorporated a new approach to augment the composition of histological staining, which enhanced the performance of federated learning models. A HER2 assessment system was developed following the proposed framework, and it was evaluated on a clinical dataset from National Taiwan University Hospital and a public dataset coordinated by the University of Warwick. This assessment system showed an accuracy exceeding 90% for both institutions, whose generalizability outperformed a baseline system developed solely through the clinical dataset by 30%. Compared to previous works where data across different institutions were mixed during model training, the HER2 assessment system achieved a similar performance while it was developed with guaranteed patient privacy via federated learning.

19

Transfer Learning for Inference of Metastatic Origin from Whole Slide Histology

Schau, G.; Ghani, H.; Burlingame, E. A.; Thibault, G.; Gray, J. W.; Corless, C.; Chang, Y. H.

2021-04-22 pathology 10.1101/2021.04.21.440864 medRxiv

Top 0.1%

35.3%

Show abstract

Accurate diagnosis of metastatic cancer is essential for prescribing optimal control strategies to halt further spread of metastasizing disease. While pathological inspection aided by immunohistochemistry staining provides a valuable gold standard for clinical diagnostics, deep learning methods have emerged as powerful tools for identifying clinically relevant features of whole slide histology relevant to a tumors metastatic origin. Although deep learning models require significant training data to learn effectively, transfer learning paradigms provide mechanisms to circumvent limited training data by first training a model on related data prior to fine-tuning on smaller data sets of interest. In this work we propose a transfer learning approach that trains a convolutional neural network to infer the metastatic origin of tumor tissue from whole slide images of hematoxylin and eosin (H&E) stained tissue sections and illustrate the advantages of pre-training network on whole slide images of primary tumor morphology. We further characterize statistical dissimilarity between primary and metastatic tumors of various indications on patch-level images to highlight limitations of our indication-specific transfer learning approach. Using a primary-to-metastatic transfer learning approach, we achieved mean class-specific areas under receiver operator characteristics curve (AUROC) of 0.779, which outperformed comparable models trained on only images of primary tumor (mean AUROC of 0.691) or trained on only images of metastatic tumor (mean AUROC of 0.675), supporting the use of large scale primary tumor imaging data in developing computer vision models to characterize metastatic origin of tumor lesions.

20

Accurate diagnosis achieved via super-resolution whole slide images by pathologists and artificial intelligence

wang, k.; LIU, R.; chen, Y.; wang, y.; qiu, y.; gao, y.; zhou, m.; bai, b.; zhang, m.; sun, k.; Deng, H.-W.; xiao, h.; Yu, G.

2024-07-07 pathology 10.1101/2024.07.05.24310022 medRxiv

Top 0.1%

34.7%

Show abstract

BackgroundDigital pathology significantly improves diagnostic efficiency and accuracy; however, pathological tissue sections are scanned at high resolutions (HR), magnified by 40 times (40X) incurring high data volume, leading to storage bottlenecks for processing large numbers of whole slide images (WSIs) for later diagnosis in clinic and hospitals. MethodWe propose to scan at a magnification of 5 times (5X). We developed a novel multi-scale deep learning super-resolution (SR) model that can be used to accurately computes 40X SR WSIs from the 5X WSIs. ResultsThe required storage size for the resultant data volume of 5X WSIs is only one sixty-fourth (less than 2%) of that of 40X WSIs. For comparison, three pathologists used 40X scanned HR and 40X computed SR WSIs from the same 480 histology glass slides spanning 47 diseases (such tumors, inflammation, hyperplasia, abscess, tumor-like lesions) across 12 organ systems. The results are nearly perfectly consistent with each other, with Kappa values (HR and SR WSIs) of 0.988{+/-}0.018, 0.924{+/-}0.059, and 0.966{+/-}0.037, respectively, for the three pathologists. There were no significant differences in diagnoses of three pathologists between the HR and corresponding SR WSIs, with Area under the Curve (AUC): 0.920{+/-}0.164 vs. 0.921{+/-}0.158 (p-value=0.653), 0.931{+/-}0.128 vs. 0.943{+/-}0.121 (p-value=0.736), and 0.946{+/-}0.088 vs. 0.941{+/-}0.098 (p-value=0.198). A previously developed highly accurate colorectal cancer artificial intelligence system (AI) diagnosed 1,821 HR and 1,821 SR WSIs, with AUC values of 0.984{+/-}0.016 vs. 0.984{+/-}0.013 (p-value=0.810), again with nearly perfect matching results. ConclusionsThe pixel numbers of 5X WSIs is only less than 2% of that of 40X WSIs. The 40X computed SR WSIs can achieve accurate diagnosis comparable to 40X scanned HR WSIs, both by pathologists and AI. This study provides a promising solution to overcome a common storage bottleneck in digital pathology.